13 research outputs found

    Machine learning approaches to video activity recognition: from computer vision to signal processing

    Get PDF
    244 p. The research presented focuses on classification techniques for two different, though related, tasks, such that the second can be considered part of the first: human action recognition in videos and sign language recognition. In the first part, the starting hypothesis is that transforming the signals of a video by means of the Common Spatial Patterns (CSP) algorithm, commonly used in electroencephalography (EEG) systems, can give rise to new features that are useful for the subsequent classification of the videos with supervised classifiers. Experiments have been carried out on several databases, including one created during this research from the point of view of a humanoid robot, with the aim of deploying the developed recognition system to improve human-robot interaction. In the second part, the techniques developed earlier are applied to sign language recognition; in addition, a method based on the decomposition of the signs is proposed to perform their recognition, which also allows for better explainability. The final goal is to develop a sign language tutor able to guide users through the learning process, making them aware of the mistakes they make and the reasons for those mistakes

    Deep Learning eredu batean oinarritutako irudien analisi eta azterketaren prototipo orokorgarri baten garapena: Nvidia Jetson TX1 plataforman inplementatuta

    Get PDF
    In this project we have worked with different methods used in computer vision, with the goal of applying them to image classification. To that end, a pre-processing of the images has been carried out and different classification models have been developed. Specifically, several machine learning and deep learning techniques have been developed and tested, analyzing the results obtained when classifying a set of images. DIGITS, a software tool offered by NVIDIA, has also been used to run further tests. Apart from that, we have worked with hardware used for deep learning, such as the Jetson TX1, in order to gain a first contact with the technologies used in mobile robots
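The pre-process-then-classify pipeline summarized above can be sketched with a toy example; the synthetic images, labels, and the nearest-neighbor classifier below are illustrative stand-ins, not the project's actual dataset or models:

```python
import numpy as np

# Hypothetical stand-in data: 8x8 grayscale "images" of two classes
# that differ in mean brightness (not the project's real dataset).
rng = np.random.default_rng(0)
bright = rng.uniform(0.6, 1.0, (50, 8, 8))
dark = rng.uniform(0.0, 0.4, (50, 8, 8))
images = np.vstack([bright, dark])
labels = np.array([1] * 50 + [0] * 50)

# Pre-processing: flatten each image into a feature vector.
features = images.reshape(len(images), -1)

# Split into train and test sets.
idx = rng.permutation(len(features))
train, test = idx[:80], idx[80:]

def nearest_neighbor_predict(x):
    """Classify by the label of the closest training image (1-NN)."""
    dists = np.linalg.norm(features[train] - x, axis=1)
    return labels[train][np.argmin(dists)]

predictions = np.array([nearest_neighbor_predict(features[i]) for i in test])
accuracy = np.mean(predictions == labels[test])
```

On this easily separable toy data the classifier should score near-perfect accuracy; real image classification, as in the project, would replace the 1-NN step with trained machine learning or deep learning models.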

    HAKA: HierArchical Knowledge Acquisition in a sign language tutor

    Get PDF
    Communication between people from different communities can sometimes be hampered by a lack of knowledge of each other's language. A large number of people need to learn a language in order to ensure fluid communication, or want to do so simply out of intellectual curiosity. Tutor tools have been developed to assist these learners. In this paper we present a tutor for learning the 42 basic hand configurations of the Spanish Sign Language, as well as more than one hundred common words. This tutor captures the user's image from an off-the-shelf webcam and challenges her to perform the hand configuration she chooses to practice. The system looks for the configuration, out of the 42 in its database, closest to the one performed by the user, and shows it to her, helping her improve through real-time knowledge of her errors. The similarities between configurations are computed using Procrustes analysis. A table with the most frequent mistakes is also recorded and made available to the user. The user may then advance to choose a word and practice the hand configurations needed for that word. Sign languages have historically been neglected, and deaf people still face important challenges in their daily activities. This research is a first step in the development of a Spanish Sign Language tutor, and the tool is available as open source. A multidimensional scaling analysis of the clustering of the 42 hand configurations induced by Procrustes similarity is also presented. This work has been partially funded by the Basque Government, Spain, under Grant number IT1427-22; the Spanish Ministry of Science (MCIU), the State Research Agency (AEI), the European Regional Development Fund (FEDER), under Grant number PID2021-122402OB-C21 (MCIU/AEI/FEDER, UE); and the Spanish Ministry of Science, Innovation and Universities, under Grant FPU18/04737.
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research
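The Procrustes-based nearest-configuration search described above can be sketched as follows; the landmark layout, template count, and noise levels are illustrative assumptions, not the tutor's actual implementation:

```python
import numpy as np
from scipy.spatial import procrustes

# Hypothetical hand configurations: 21 (x, y) landmarks each, as an
# off-the-shelf hand tracker might produce (illustrative data only).
rng = np.random.default_rng(0)
templates = [rng.random((21, 2)) for _ in range(5)]  # stand-ins for the 42

# The user's attempt: template 2 with noise, plus a shift and rescale,
# which Procrustes analysis factors out before comparing shapes.
attempt = 2.0 * (templates[2] + rng.normal(0, 0.01, (21, 2))) + 5.0

# Disparity is the residual sum of squared differences after optimally
# translating, scaling and rotating one shape onto the other.
disparities = [procrustes(t, attempt)[2] for t in templates]
closest = int(np.argmin(disparities))
```

Because Procrustes analysis normalizes away position, scale and orientation, the tutor can match the user's hand shape against its stored configurations regardless of where the hand appears in the frame.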

    RANSAC for Robotic Applications: A Survey

    Get PDF
    Random Sample Consensus, most commonly abbreviated as RANSAC, is a robust method for estimating the parameters of a model contaminated by a sizable percentage of outliers. In its simplest form, the process starts with sampling the minimum data needed to perform an estimation, followed by an evaluation of its adequacy, and further repetitions of this process until some stopping criterion is met. Multiple variants have been proposed in which this workflow is modified, typically tweaking one or several of these steps to improve computing time or the quality of the parameter estimates. RANSAC is widely applied in the field of robotics, for example for finding geometric shapes (planes, cylinders, spheres, etc.) in point clouds or for estimating the best transformation between different camera views. In this paper, we present a review of the current state of the art of RANSAC family methods, with special interest in applications in robotics. This work has been partially funded by the Basque Government, Spain, under Research Teams Grant number IT1427-22 and under ELKARTEK LANVERSO Grant number KK-2022/00065; the Spanish Ministry of Science (MCIU), the State Research Agency (AEI), the European Regional Development Fund (FEDER), under Grant number PID2021-122402OB-C21 (MCIU/AEI/FEDER, UE); and the Spanish Ministry of Science, Innovation and Universities, under Grant FPU18/04737
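The sample-estimate-evaluate loop described above can be sketched for the simplest case, a 2D line; the data, threshold and iteration count below are illustrative choices, not taken from the survey:

```python
import numpy as np

def ransac_line(points, n_iters=200, threshold=0.1, seed=0):
    """Minimal RANSAC sketch: fit y = a*x + b to points with outliers.

    Each iteration samples the minimum data for a line (2 points),
    estimates the parameters, and counts inliers within `threshold`;
    the model with the most inliers wins.
    """
    rng = np.random.default_rng(seed)
    best_params, best_inliers = None, 0
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if np.isclose(x1, x2):
            continue  # vertical pair: cannot estimate a slope
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = int((residuals < threshold).sum())
        if inliers > best_inliers:
            best_params, best_inliers = (a, b), inliers
    return best_params, best_inliers

# Synthetic data: 80 inliers near y = 2x + 1 plus 20 gross outliers.
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 80)
inlier_pts = np.column_stack([x, 2 * x + 1 + rng.normal(0, 0.02, 80)])
outlier_pts = rng.uniform(-5, 5, (20, 2))
pts = np.vstack([inlier_pts, outlier_pts])

(a, b), n_in = ransac_line(pts)
```

Despite 20% gross contamination, the recovered slope and intercept stay close to the true values, which is exactly the robustness property the survey reviews; the variants it covers modify the sampling, scoring, or stopping steps of this same loop.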

    Using Common Spatial Patterns to Select Relevant Pixels for Video Activity Recognition

    Get PDF
    By Itsaso Rodríguez-Moreno, José María Martínez-Otzeta, Basilio Sierra, Itziar Irigoien, Igor Rodriguez-Rodriguez and Izaro Goienetxea. Department of Computer Science and Artificial Intelligence, University of the Basque Country, Manuel Lardizabal 1, 20018 Donostia-San Sebastián, Spain. Appl. Sci. 2020, 10(22), 8075; https://doi.org/10.3390/app10228075. Received: 1 October 2020 / Revised: 30 October 2020 / Accepted: 11 November 2020 / Published: 14 November 2020. This article belongs to the Special Issue Advanced Intelligent Imaging Technology II. Video activity recognition, although an emerging task, has been the subject of substantial research due to the importance of its everyday applications. Video camera surveillance could benefit greatly from advances in this field. In the area of robotics, the tasks of autonomous navigation or social interaction could also take advantage of the knowledge extracted from live video recordings. In this paper, a new approach for video action recognition is presented. The new technique consists of taking a method usually used in Brain-Computer Interface (BCI) electroencephalography (EEG) systems and adapting it to this problem. After the technique is described, the achieved results are shown and a comparison with another method is carried out to analyze the performance of the new approach. This work has been partially funded by the Basque Government, Research Teams grant number IT900-16, ELKARTEK 3KIA project KK-2020/00049, and the Spanish Ministry of Science (MCIU), the State Research Agency (AEI), and the European Regional Development Fund (FEDER), grant number RTI2018-093337-B-I100 (MCIU/AEI/FEDER, UE).
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research

    Video Activity Recognition: State-of-the-Art

    Get PDF
    This article belongs to the Section Physical Sensors. Video activity recognition, although an emerging task, has been the subject of important research efforts due to the importance of its everyday applications. Surveillance by video cameras could benefit greatly from advances in this field. In the area of robotics, the tasks of autonomous navigation or social interaction could also take advantage of the knowledge extracted from live video recordings. The aim of this paper is to survey the state-of-the-art techniques for video activity recognition, while also mentioning other techniques for the same task that the research community has known for several years. For each of the analyzed methods, its contribution over previous works and the performance of the proposed approach are discussed. This work has been partially supported by the Basque Government, Spain (IT900-16), and the Spanish Ministry of Economy and Competitiveness (RTI2018-093337-B-I00, MINECO/FEDER, EU).

    Shedding Light on People Action Recognition in Social Robotics by Means of Common Spatial Patterns

    No full text
    Action recognition in robotics is a research field that has gained momentum in recent years. In this work, a video activity recognition method is presented, which has the ultimate goal of endowing a robot with action recognition capabilities for a more natural social interaction. The application of Common Spatial Patterns (CSP), a signal processing approach widely used in electroencephalography (EEG), is presented in a novel manner to be used in activity recognition in videos taken by a humanoid robot. A sequence of skeleton data is considered as a multidimensional signal and filtered according to the CSP algorithm. Then, characteristics extracted from these filtered data are used as features for a classifier. A database with 46 individuals performing six different actions has been created to test the proposed method. The CSP-based method along with a Linear Discriminant Analysis (LDA) classifier has been compared to a Long Short-Term Memory (LSTM) neural network, showing that the former obtains similar or better results than the latter, while being simpler
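The CSP step described above, treating a sequence of skeleton data as a multidimensional signal, can be sketched as follows; the trial dimensions, the synthetic variance structure, and the eigendecomposition details are illustrative assumptions rather than the paper's exact implementation:

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(class_a, class_b, n_filters=2):
    """Common Spatial Patterns sketch for two-class trials.

    Each trial is a (channels, time) array -- here a "channel" stands
    for one skeleton coordinate tracked over time. CSP finds spatial
    filters maximizing variance for one class while minimizing it for
    the other, via a generalized eigendecomposition of the class
    covariance matrices.
    """
    def mean_cov(trials):
        covs = []
        for X in trials:
            C = X @ X.T
            covs.append(C / np.trace(C))  # trace-normalize each trial
        return np.mean(covs, axis=0)

    Ca, Cb = mean_cov(class_a), mean_cov(class_b)
    # Solve Ca w = lambda (Ca + Cb) w; the extreme eigenvectors give
    # the most discriminative spatial filters for each class.
    vals, vecs = eigh(Ca, Ca + Cb)
    order = np.argsort(vals)
    picks = np.concatenate([order[:n_filters], order[-n_filters:]])
    return vecs[:, picks].T  # shape: (2 * n_filters, channels)

# Hypothetical data: 20 trials per action, 10 "channels", 50 frames,
# with each class carrying high variance on a different channel.
rng = np.random.default_rng(0)
scale_a = np.ones(10); scale_a[0] = 3.0
scale_b = np.ones(10); scale_b[9] = 3.0
trials_a = [rng.normal(0, scale_a[:, None], (10, 50)) for _ in range(20)]
trials_b = [rng.normal(0, scale_b[:, None], (10, 50)) for _ in range(20)]

W = csp_filters(trials_a, trials_b)
# Log-variance of the filtered signals is the classic CSP feature vector
# that a downstream classifier such as LDA would consume.
features = np.log(np.var(W @ trials_a[0], axis=1))
```

The resulting low-dimensional log-variance features are what make the CSP-plus-LDA pipeline so much simpler than an LSTM while, as the abstract reports, remaining competitive.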